GROUP B STREPTOCOCCUS

This application incorporates by reference the contents of each of two duplicate CD-ROMs 
which contain an identical 90.1 MB file labeled "PP28007 PCT sequence listing.txt," which is the 
sequence listing for this application. The CD-ROMs were created on December 21, 2005. This 
application also incorporates by reference the contents of each of two duplicate CD-ROMs which
contain an identical 681 KB file labeled "Table I.txt" and containing Table I. The CD-ROMs were 
created on December 20, 2005. 

All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the field of Streptococcus biology, and in particular relates to S.agalactiae, also
known as 'group B streptococcus' (GBS).

BACKGROUND ART 

Once thought to infect only cows, the Gram-positive bacterium Streptococcus agalactiae (or "group 
B streptococcus", abbreviated to "GBS") is now known to cause serious disease, bacteremia and
meningitis, in immunocompromised individuals and in neonates. There are two types of neonatal
infection. The first (early onset, usually within 5 days of birth) is manifested by bacteremia and 
pneumonia. It is contracted vertically as a baby passes through the birth canal. GBS colonises the 
vagina of about 25% of young women, and approximately 1% of infants born via a vaginal birth to 
colonised mothers will become infected. Mortality is between 50% and 70%. The second is a meningitis
that occurs 10 to 60 days after birth. If pregnant women are vaccinated with type III capsule so that
the infants are passively immunised, the incidence of the late onset meningitis is reduced but is not 
entirely eliminated. 

The "B" in "GBS" refers to the Lancefield classification, which is based on the antigenicity of a 
carbohydrate which is soluble in dilute acid and called the C carbohydrate. Lancefield identified 13
types of C carbohydrate, designated A to O, that could be serologically differentiated. The organisms
that most commonly infect humans are found in groups A, B, D, and G. Within group B, strains can
be divided into 8 serotypes (Ia, Ib, Ia/c, II, III, IV, V, and VI) based on the structure of their
polysaccharide capsule. The genome sequence of a serotype V strain of GBS has been published and
analysed [1,2], including a comparative genome hybridization analysis of 19 disease-causing isolates
of the same type V strain 2603 V/R. The genome sequence of a serotype III strain is also known [3].

Current GBS vaccines are based on polysaccharide antigens, although these suffer from poor 
immunogenicity. Anti-idiotypic approaches have also been used (e.g. ref. 4). There remains a need,
however, for effective adult vaccines against S.agalactiae infection. 

It is an object of the invention to provide proteins which can be used in the development of such 
vaccines. The proteins may also be useful for diagnostic purposes, and as targets for antibiotics.

DISCLOSURE OF THE INVENTION

Polypeptides 

The invention provides polypeptides comprising the GBS amino acid sequences disclosed in the 
examples. These amino acid sequences are the even SEQ ID NOs between 2 and 22740. There are 
thus 11370 amino acid sequences. The polypeptides encoded by sequences listed in Table IV have
not previously been seen in GBS strains. 
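
The sequence count stated above can be checked with a short sketch. The variable names are illustrative only, and the odd-numbered range is taken from the later statement that the nucleic acid sequences are the odd SEQ ID NOs between 1 and 22739.

```python
# Even SEQ ID NOs from 2 to 22740 inclusive (amino acid sequences).
amino_acid_ids = range(2, 22740 + 1, 2)

# Odd SEQ ID NOs from 1 to 22739 inclusive (nucleotide sequences).
nucleotide_ids = range(1, 22739 + 1, 2)

print(len(amino_acid_ids), len(nucleotide_ids))
```

Both ranges contain the same number of entries, which is consistent with the figure of 11370 amino acid sequences given in the text.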

The invention also provides polypeptides comprising amino acid sequences that have sequence 
identity to the GBS amino acid sequences disclosed in the examples. Depending on the particular 
sequence, the degree of sequence identity is preferably greater than 50% (e.g. 60%, 70%, 75%, 80%,
85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more). These polypeptides include
homologs, orthologs, allelic variants and functional mutants. Typically, 50% identity or more
between two polypeptide sequences is considered to be an indication of functional equivalence.
Identity between polypeptides is preferably determined by the Smith-Waterman homology search
algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search
with parameters gap open penalty = 12 and gap extension penalty = 1.
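
As an illustration only, a Smith-Waterman local alignment score with affine gaps can be sketched as follows. This is not the MPSRCH implementation: the match/mismatch scores stand in for a substitution matrix that the text does not specify, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend is an assumption. The sketch returns only the best local score; computing percent identity would additionally require a traceback.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Best local alignment score using Gotoh's affine-gap recurrences.

    match/mismatch are assumed placeholder scores, not values from the text.
    """
    neg = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap that consumes a residue of b
    # F: best score ending with a gap that consumes a residue of a
    H = [[0] * cols for _ in range(rows)]
    E = [[neg] * cols for _ in range(rows)]
    F = [[neg] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Either open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


print(smith_waterman_affine("ACDEFGHIKL", "ACDEFHIKL"))
```

With the default gap open penalty of 12, a single-residue gap is only worth opening when the aligned regions on either side contribute more than the penalty, which is the behaviour the affine scheme is meant to give.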

These polypeptides may, compared to the GBS sequences of the examples, include one or more (e.g.
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) conservative amino acid replacements i.e. replacements of one amino
acid with another which has a related side chain. Genetically-encoded amino acids are generally
divided into four families: (1) acidic i.e. aspartate, glutamate; (2) basic i.e. lysine, arginine, histidine;
(3) non-polar i.e. alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan;
and (4) uncharged polar i.e. glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine.
Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids. In
general, substitution of single amino acids within these families does not have a major effect on the
biological activity. The polypeptides may have one or more (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.)
single amino acid deletions relative to the GBS sequences of the examples. The polypeptides may
also include one or more (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) insertions (e.g. each of 1, 2, 3, 4 or 5 
amino acids) relative to the GBS sequences of the examples. Some of these deletions, insertions or 
substitutions may convert one sequence of the invention to another sequence of the invention e.g.
amino acids 180-230 of SEQ ID NO: 8614 (identical to amino acids 173-223 of SEQ ID NO: 14060,
and amino acids 4-54 of SEQ ID NO: 3916) become amino acids 180-230 of SEQ ID NO: 12908 by
conservative substitution of Ile-185 for Val. 
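
The four families described above can be written as a small lookup, shown here as a minimal sketch. The one-letter codes are the standard ones; the function names are illustrative and not part of the disclosure.

```python
# The four side-chain families listed in the text, as one-letter codes.
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "non-polar": set("AVLIPFMW"),
    "uncharged polar": set("GNQCSTY"),
}


def family_of(residue):
    """Return the family name of a genetically-encoded amino acid."""
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def is_conservative(original, replacement):
    """True if the replacement is a different residue from the same family."""
    return original != replacement and family_of(original) == family_of(replacement)


print(is_conservative("I", "V"))
```

Under this classification the Ile/Val exchange mentioned in the text is conservative, since both residues fall in the non-polar family.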

Preferred polypeptides of the invention are listed below, including polypeptides that are lipidated, 
that are located in the outer membrane, that are located in the inner membrane, or that are located in 
the periplasm. Particularly preferred polypeptides are those that fall into more than one of these 
categories e.g. lipidated polypeptides that are located in the outer membrane. Lipoproteins may have
a N-terminal cysteine to which lipid is covalently attached, following post-translational processing of 
the signal peptide. 

The invention further provides polypeptides comprising fragments of the GBS amino acid sequences
disclosed in the examples. The fragments should comprise at least n consecutive amino acids from
the sequences and, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 
20, 22, 24, 26, 28, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or more). 

The fragment may comprise at least one T-cell or, preferably, a B-cell epitope of the sequence. T- 
and B-cell epitopes can be identified empirically (e.g. using PEPSCAN [5,6] or similar methods), or 
they can be predicted (e.g. using the Jameson-Wolf antigenic index [7], matrix-based approaches [8],
TEPITOPE [9], neural networks [10], OptiMer & EpiMer [11,12], ADEPT [13], Tsites [14], 
hydrophilicity [15], antigenic index [16] or the methods disclosed in reference 17, etc.). Other 
preferred fragments are (a) the N-terminal signal peptides of the GBS polypeptides of the invention, 
(b) the GBS polypeptides, but without their N-terminal signal peptides, (c) the GBS polypeptides, but 
without their N-terminal amino acid residue. 

Further preferred fragments are those common to at least two (e.g. 2, 3, 4 or 5) homologous coding 
sequences, and in particular those common to homologous coding sequences within the sequence 
listing. Table II shows homologous SEQ ID numbers for nucleic acids within the sequence listing 
e.g. SEQ ID NOs: 88, 4374, 8834, 13214 and 17994 are homologous within the sequence listing, and 
are also homologous with prior art GI sequences 22533036 and 23094457. Simple alignments show 
that amino acids 1-131 of these five SEQ ID NOs are common, as are amino acids 133-176, 178-182, 
184-190, 192-217, 219-250, 252-278, 280-322, 324-366, 368-373 and 375-434. Similarly, 1-176 are 
common to SEQ ID NOs: 88, 4374, 8834 and 13214, but not to 17994. Thus fragments 1-131, 1-176 
and 133-176 are all preferred fragments of the invention. In some cases, where homologous 
sequences are 100% identical between strains along their complete lengths (e.g. SEQ ID NOs: 2, 
8616, 12910, 14062 and 22384), the common 'fragment' will in fact be the complete sequence. 
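
The "simple alignments" referred to above can be sketched as a position-by-position comparison. The sketch assumes the homologous sequences are already aligned without gaps, so that position i means the same residue number in every sequence. The toy sequences in the usage line are invented for illustration and are not taken from the sequence listing.

```python
def common_ranges(seqs, min_len=1):
    """Return 1-based inclusive (start, end) ranges where all sequences agree.

    Assumes the sequences are pre-aligned and gap-free; only positions up to
    the shortest sequence are compared.
    """
    length = min(len(s) for s in seqs)
    ranges, start = [], None
    for pos in range(length):
        same = all(s[pos] == seqs[0][pos] for s in seqs)
        if same and start is None:
            start = pos  # a shared stretch begins here
        elif not same and start is not None:
            if pos - start >= min_len:
                ranges.append((start + 1, pos))
            start = None
    # Close a shared stretch that runs to the end of the sequences.
    if start is not None and length - start >= min_len:
        ranges.append((start + 1, length))
    return ranges


print(common_ranges(["MKTSLLA", "MKTFLLA", "MKTSLLG"]))
```

Passing min_len=7 would keep only shared stretches long enough to meet the minimum fragment length of 7 amino acids mentioned earlier.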

Other preferred fragments are those that begin with an amino acid encoded by a potential start codon 
(ATG, GTG, TTG). Fragments starting at the methionine encoded by a start codon downstream of 
the indicated start codon are polypeptides of the invention. 

Polypeptides of the invention can be prepared in many ways e.g. by chemical synthesis (in whole or 
in part), by digesting longer polypeptides using proteases, by translation from RNA, by purification 
from cell culture (e.g. from recombinant expression), from the organism itself (e.g. after bacterial
culture, or direct from patients), etc. A preferred method for production of peptides <40 amino acids 
long involves in vitro chemical synthesis [18,19]. Solid-phase peptide synthesis is particularly 
preferred, such as methods based on tBoc or Fmoc [20] chemistry. Enzymatic synthesis [21] may 
also be used in part or in full. As an alternative to chemical synthesis, biological synthesis may be 
used e.g. the polypeptides may be produced by translation. This may be carried out in vitro or in vivo. 
Biological methods are in general restricted to the production of polypeptides based on L-amino 
acids, but manipulation of translation machinery (e.g. of aminoacyl tRNA molecules) can be used to 
allow the introduction of D-amino acids (or of other non-natural amino acids, such as iodotyrosine or
methylphenylalanine, azidohomoalanine, etc.) [22]. Where D-amino acids are included, however, it
is preferred to use chemical synthesis. Polypeptides of the invention may have covalent 
modifications at the C-terminus and/or N-terminus. 

Polypeptides of the invention can take various forms (e.g. native, fusions, glycosylated, 
non-glycosylated, lipidated, non-lipidated, phosphorylated, non-phosphorylated, myristoylated,
non-myristoylated, monomeric, multimeric, particulate, denatured, etc.). 

Polypeptides of the invention are preferably provided in purified or substantially purified form i.e. 
substantially free from other polypeptides (e.g. free from naturally-occurring polypeptides), 
particularly from other streptococcal or host cell polypeptides, and are generally at least about 50% 
pure (by weight), and usually at least about 90% pure i.e. less than about 50%, and more preferably
less than about 10% (e.g. 5%) of a composition is made up of other expressed polypeptides. 
Polypeptides of the invention are preferably GBS polypeptides. Polypeptides of the invention 
preferably have the function indicated in Table I for the relevant sequence. 

Polypeptides of the invention may be attached to a solid support. Polypeptides of the invention may 
comprise a detectable label (e.g. a radioactive or fluorescent label, or a biotin label).

The term "polypeptide" refers to amino acid polymers of any length. The polymer may be linear or 
branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The
term also encompasses an amino acid polymer that has been modified naturally or by intervention; for
example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any
other manipulation or modification, such as conjugation with a labeling component. Also included
within the definition are, for example, polypeptides containing one or more analogs of an amino acid 
(including, for example, unnatural amino acids, etc.), as well as other modifications known in the art. 
Polypeptides can occur as single chains or associated chains. Polypeptides of the invention can be 
naturally or non-naturally glycosylated (i.e. the polypeptide has a glycosylation pattern that differs
from the glycosylation pattern found in the corresponding naturally occurring polypeptide).

Polypeptides of the invention may be at least 40 amino acids long (e.g. at least 40, 50, 60, 70, 80, 90, 
100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 350, 400, 450, 500 or more). Polypeptides of 
the invention may be shorter than 500 amino acids (e.g. no longer than 40, 50, 60, 70, 80, 90, 100, 
120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 350, 400 or 450 amino acids).

The invention provides polypeptides comprising a sequence -X-Y- or -Y-X-, wherein: -X- is an
amino acid sequence as defined above and -Y- is not a sequence as defined above i.e. the invention 
provides fusion proteins. Where the N-terminus codon of a polypeptide-coding sequence is not ATG 
then that codon will be translated as the standard amino acid for that codon rather than as a Met, 
which occurs when the codon is translated as a start codon. 

The invention provides a process for producing polypeptides of the invention, comprising the step of
culturing a host cell of the invention under conditions which induce polypeptide expression.

The invention provides a process for producing a polypeptide of the invention, wherein the
polypeptide is synthesised in part or in whole using chemical means.

The invention provides a composition comprising two or more polypeptides of the invention. 

The invention also provides a hybrid polypeptide represented by the formula NH2-A-[-X-L-]n-B-COOH,
wherein X is a polypeptide of the invention as defined above, L is an optional linker amino
acid sequence, A is an optional N-terminal amino acid sequence, B is an optional C-terminal amino
acid sequence, and n is an integer greater than 1. The value of n is between 2 and x, and the value of
x is typically 3, 4, 5, 6, 7, 8, 9 or 10. Preferably n is 2, 3 or 4; it is more preferably 2 or 3; most
preferably, n = 2. For each of the n instances, -X- may be the same or different. For each of the n
instances of [-X-L-], linker amino acid sequence -L- may be present or absent. For instance, when n=2
the hybrid may be NH2-X1-L1-X2-L2-COOH, NH2-X1-X2-COOH, NH2-X1-L1-X2-COOH, NH2-X1-X2-L2-COOH,
etc. Linker amino acid sequence(s) -L- will typically be short (e.g. 20 or fewer amino acids
i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include short peptide
sequences which facilitate cloning, poly-glycine linkers (i.e. Gly_n where n = 2, 3, 4, 5, 6, 7, 8, 9, 10
or more), and histidine tags (i.e. His_n where n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable linker
amino acid sequences will be apparent to those skilled in the art. -A- and -B- are optional sequences
which will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30,
29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1).
Examples include leader sequences to direct polypeptide trafficking, or short peptide sequences
which facilitate cloning or purification (e.g. histidine tags i.e. His_n where n = 3, 4, 5, 6, 7, 8, 9, 10 or
more). Other suitable N-terminal and C-terminal amino acid sequences will be apparent to those
skilled in the art.
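
The hybrid formula can be illustrated by assembling one-letter sequences in N-terminus to C-terminus order. This is a sketch only: the function name, argument names and the short toy sequences are hypothetical, and an empty string is used to represent an absent linker or an absent -A- or -B- sequence.

```python
def build_hybrid(xs, linkers=None, a="", b=""):
    """Assemble A-[X-L]n-B from one-letter amino acid strings.

    xs      : the n polypeptide sequences X1..Xn (n must be at least 2)
    linkers : one linker per X; "" means that linker is absent
    a, b    : optional N-terminal and C-terminal sequences
    """
    if len(xs) < 2:
        raise ValueError("n must be an integer greater than 1")
    if linkers is None:
        linkers = [""] * len(xs)
    if len(linkers) != len(xs):
        raise ValueError("expected one (possibly empty) linker per X")
    body = "".join(x + linker for x, linker in zip(xs, linkers))
    return a + body + b


# X1-L1-X2 with a Gly4 linker and a C-terminal His6 tag as -B- (toy sequences).
print(build_hybrid(["MKT", "AVL"], linkers=["GGGG", ""], b="HHHHHH"))
```

With n = 2 the four linker combinations listed in the text correspond to the four ways of leaving each entry of linkers empty or non-empty.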

Various tests can be used to assess the in vivo immunogenicity of polypeptides of the invention. For 
example, polypeptides can be expressed recombinantly and used to screen patient sera by 
immunoblot. A positive reaction between the polypeptide and patient serum indicates that the patient
has previously mounted an immune response to the protein in question i.e. the protein is an 
immunogen. This method can also be used to identify immunodominant proteins. 

Antibodies 

The invention provides antibodies that bind to polypeptides of the invention. These may be
polyclonal or monoclonal and may be produced by any suitable means (e.g. by recombinant
expression). To increase compatibility with the human immune system, the antibodies may be 
chimeric or humanised [e.g. refs. 23 & 24], or fully human antibodies may be used. The antibodies 
may include a detectable label (e.g. for diagnostic assays). Antibodies of the invention may be 
attached to a solid support. Antibodies of the invention are preferably neutralising antibodies. 

Monoclonal antibodies are particularly useful in identification and purification of the individual
polypeptides against which they are directed. Monoclonal antibodies of the invention may also be
employed as reagents in immunoassays, radioimmunoassays (RIA) or enzyme-linked immunosorbent
assays (ELISA), etc. In these applications, the antibodies can be labelled with an analytically-detectable
reagent such as a radioisotope, a fluorescent molecule or an enzyme. The monoclonal
antibodies produced by the above method may also be used for the molecular identification and 
characterization (epitope mapping) of polypeptides of the invention. 

Antibodies of the invention are preferably specific to Streptococci i.e. they bind preferentially to 
Streptococci bacteria relative to non-Streptococci bacteria. More preferably, the antibodies are 
specific to GBS i.e. they bind preferentially to GBS bacteria relative to non-group-B streptococci.

Antibodies of the invention are preferably provided in purified or substantially purified form. 
Typically, the antibody will be present in a composition that is substantially free of other 
polypeptides e.g. where less than 90% (by weight), usually less than 60% and more usually less than 
50% of the composition is made up of other polypeptides. 

Antibodies of the invention can be of any isotype (e.g. IgA, IgG, IgM i.e. an α, γ or μ heavy chain),
but will generally be IgG. Within the IgG isotype, antibodies may be IgG1, IgG2, IgG3 or IgG4
subclass. Antibodies of the invention may have a κ or a λ light chain.

Antibodies of the invention can take various forms, including whole antibodies, antibody fragments 
such as F(ab')2 and F(ab) fragments, Fv fragments (non-covalent heterodimers), single-chain
antibodies such as single chain Fv molecules (scFv), minibodies, oligobodies, etc. The term 
"antibody" does not imply any particular origin, and includes antibodies obtained through 
non-conventional processes, such as phage display. 

The invention provides a process for detecting polypeptides of the invention, comprising the steps of: 
(a) contacting an antibody of the invention with a biological sample under conditions suitable for the 
formation of antibody-antigen complexes; and (b) detecting said complexes.

The invention provides a process for detecting antibodies of the invention, comprising the steps of: 
(a) contacting a polypeptide of the invention with a biological sample (e.g. a blood or serum sample) 
under conditions suitable for the formation of antibody-antigen complexes; and (b) detecting said
complexes. 

For good cross-reactivity, preferred antibodies of the invention bind to epitopes that
are common to at least two (e.g. 2, 3, 4 or 5) homologous coding sequences, as described in more 
detail above. Conversely, for good specificity, other preferred antibodies of the invention bind to 
epitopes that include an amino acid that differs between homologous coding sequences e.g. binds to 
Phe-132 in SEQ ID NO: 17994 to distinguish from SEQ ID NOs: 88, 4374, 8834 and 13214, all of 
which have a Serine residue at position 132. 

Nucleic acids 

The invention provides nucleic acid comprising the GBS nucleotide sequences disclosed in the
examples. These nucleic acid sequences are the odd SEQ ID NOs between 1 and 22739.

The invention also provides nucleic acid comprising nucleotide sequences having sequence identity
to the GBS nucleotide sequences disclosed in the examples. Identity between sequences is preferably 
determined by the Smith-Waterman homology search algorithm as described above.

The invention also provides nucleic acid which can hybridize to the GBS nucleic acid disclosed in 
the examples. Hybridization reactions can be performed under conditions of different "stringency".
Conditions that increase stringency of a hybridization reaction are widely known and published in the
art [e.g. page 7.52 of reference 25]. Examples of relevant conditions include (in order of increasing
stringency): incubation temperatures of 25°C, 37°C, 50°C, 55°C and 68°C; buffer concentrations of
10 x SSC, 6 x SSC, 1 x SSC, 0.1 x SSC (where SSC is 0.15 M NaCl and 15 mM citrate buffer) and
their equivalents using other buffer systems; formamide concentrations of 0%, 25%, 50%, and 75%;
incubation times from 5 minutes to 24 hours; 1, 2, or more washing steps; wash incubation times of
1, 2, or 15 minutes; and wash solutions of 6 x SSC, 1 x SSC, 0.1 x SSC, or de-ionized water.
Hybridization techniques and their optimization are well known in the art [e.g. see refs 25-28, etc.]. 

In some embodiments, nucleic acid of the invention hybridizes to a target of the invention under low 
stringency conditions; in other embodiments it hybridizes under intermediate stringency conditions;
in preferred embodiments, it hybridizes under high stringency conditions. An exemplary set of low 
stringency hybridization conditions is 50°C and 10 x SSC. An exemplary set of intermediate 
stringency hybridization conditions is 55°C and 1 x SSC. An exemplary set of high stringency 
hybridization conditions is 68°C and 0.1 x SSC. 

Nucleic acid comprising fragments of these sequences is also provided. These should comprise at
least n consecutive nucleotides from the GBS sequences and, depending on the particular sequence, n
is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more).

The invention provides nucleic acid of formula 5'-X-Y-Z-3', wherein: -X- is a nucleotide sequence
consisting of x nucleotides; -Z- is a nucleotide sequence consisting of z nucleotides; -Y- is a
nucleotide sequence consisting of either (a) a fragment of one of the odd-numbered SEQ ID NOs: 1
to 22739, or (b) the complement of (a); and said nucleic acid 5'-X-Y-Z-3' is neither (i) a fragment of
one of the odd-numbered SEQ ID NOs: 1 to 22739 nor (ii) the complement of (i). The -X- and/or -Z-
moieties may comprise a promoter sequence (or its complement). 

The invention also provides nucleic acid encoding the polypeptides and polypeptide fragments of the 
invention.

The invention includes nucleic acid comprising sequences complementary to the sequences disclosed 
in the sequence listing (e.g. for antisense or probing, or for use as primers), as well as the sequences 
in the orientation actually shown. 

Nucleic acids of the invention can be used in hybridisation reactions (e.g. Northern or Southern blots, 
or in nucleic acid microarrays or 'gene chips') and amplification reactions (e.g. PCR, SDA, SSSR,
LCR, TMA, NASBA, etc.) and other nucleic acid techniques.

Nucleic acid according to the invention can take various forms (e.g. single-stranded, double-stranded,
vectors, primers, probes, labelled etc.). Nucleic acids of the invention may be circular or branched, 
but will generally be linear. Unless otherwise specified or required, any embodiment of the invention 
that utilizes a nucleic acid may utilize both the double-stranded form and each of two complementary 
single-stranded forms which make up the double-stranded form. Primers and probes are generally
single-stranded, as are antisense nucleic acids. 

Nucleic acids of the invention are preferably provided in purified or substantially purified form i.e. 
substantially free from other nucleic acids (e.g. free from naturally-occurring nucleic acids), 
particularly from other streptococcal or host cell nucleic acids, generally being at least about 50% 
pure (by weight), and usually at least about 90% pure. Nucleic acids of the invention are preferably
GBS nucleic acids. 

Nucleic acids of the invention may be prepared in many ways e.g. by chemical synthesis (e.g. 
phosphoramidite synthesis of DNA) in whole or in part, by digesting longer nucleic acids using 
nucleases (e.g. restriction enzymes), by joining shorter nucleic acids or nucleotides (e.g. using ligases 
or polymerases), from genomic or cDNA libraries, etc.

Nucleic acid of the invention may be attached to a solid support (e.g. a bead, plate, filter, film, slide, 
microarray support, resin, etc.). Nucleic acid of the invention may be labelled e.g. with a radioactive 
or fluorescent label, or a biotin label. This is particularly useful where the nucleic acid is to be used 
in detection techniques e.g. where the nucleic acid is a primer or as a probe. 

The term "nucleic acid" in general means a polymeric form of nucleotides of any length,
which contains deoxyribonucleotides, ribonucleotides, and/or their analogs. It includes DNA, RNA,
and DNA/RNA hybrids. It also includes DNA or RNA analogs, such as those containing modified
backbones (e.g. peptide nucleic acids (PNAs) or phosphorothioates) or modified bases. Thus the
invention includes mRNA, tRNA, rRNA, ribozymes, DNA, cDNA, recombinant nucleic acids,
branched nucleic acids, plasmids, vectors, probes, primers, etc. Where nucleic acid of the invention
takes the form of RNA, it may or may not have a 5' cap. 

Nucleic acids of the invention comprise GBS sequences, but they may also comprise non-GBS 
sequences (e.g. in nucleic acids of formula 5'-X-Y-Z-3', as defined above). This is particularly useful
for primers, which may thus comprise a first sequence complementary to a GBS nucleic acid target
and a second sequence which is not complementary to the nucleic acid target. Any such
non-complementary sequences in the primer are preferably 5' to the complementary sequences.
Typical non-complementary sequences comprise restriction sites or promoter sequences. 

Nucleic acids of the invention can be prepared in many ways e.g. by chemical synthesis (at least in 
part), by digesting longer nucleic acids using nucleases (e.g. restriction enzymes), by joining shorter 
nucleic acids (e.g. using ligases or polymerases), from genomic or cDNA libraries, etc.

Nucleic acids of the invention may be part of a vector i.e. part of a nucleic acid construct designed
for transduction/transfection of one or more cell types. Vectors may be, for example, "cloning 
vectors" which are designed for isolation, propagation and replication of inserted nucleotides, 
"expression vectors" which are designed for expression of a nucleotide sequence in a host cell, "viral 
vectors" which are designed to result in the production of a recombinant virus or virus-like particle, or
"shuttle vectors", which comprise the attributes of more than one type of vector. Preferred vectors 
are plasmids. A "host cell" includes an individual cell or cell culture which can be or has been a 
recipient of exogenous nucleic acid. Host cells include progeny of a single host cell, and the progeny 
may not necessarily be completely identical (in morphology or in total DNA complement) to the 
original parent cell due to natural, accidental, or deliberate mutation and/or change. Host cells
include cells transfected or infected in vivo or in vitro with nucleic acid of the invention. 

Where a nucleic acid is DNA, it will be appreciated that "U" in a RNA sequence will be replaced by 
"T" in the DNA. Similarly, where a nucleic acid is RNA, it will be appreciated that "T" in a DNA 
sequence will be replaced by "U" in the RNA. 

The term "complement" or "complementary" when used in relation to nucleic acids refers to Watson-
Crick base pairing. Thus the complement of C is G, the complement of G is C, the complement of A 
is T (or U), and the complement of T (or U) is A. It is also possible to use bases such as I (the purine 
inosine) e.g. to complement pyrimidines (C or T). The terms also imply a direction - the complement 
of 5'-ACAGT-3' is 5'-ACTGT-3' rather than 5'-TGTCA-3'.
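
The directional sense of "complement" described above corresponds to what is often called the reverse complement. A minimal sketch, assuming plain A/C/G/T input (inosine and other non-standard bases are not handled) and using illustrative function names:

```python
# Watson-Crick pairs for DNA, as listed in the text.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Complement of a DNA sequence, written 5'->3' like the input."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def to_rna(dna):
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


print(reverse_complement("ACAGT"))
print(to_rna(reverse_complement("ACAGT")))
```

Pairing the bases without reversing the order would give TGTCA, which is the reading the text excludes.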

Nucleic acids of the invention can be used, for example: to produce polypeptides; as hybridization
probes for the detection of nucleic acid in biological samples; to generate additional copies of the 
nucleic acids; to generate ribozymes or antisense oligonucleotides; as single-stranded DNA primers 
or probes; or as triple-strand forming oligonucleotides. 

The invention provides a process for producing nucleic acid of the invention, wherein the nucleic 
acid is synthesised in part or in whole using chemical means.

The invention provides vectors comprising nucleotide sequences of the invention (e.g. cloning or
expression vectors) and host cells transformed with such vectors. 

The invention also provides a kit comprising primers (e.g. PCR primers) for amplifying a template
sequence contained within a streptococcus bacterium (e.g. GBS) nucleic acid sequence, the kit
comprising a first primer and a second primer, wherein the first primer is substantially
complementary to said template sequence and the second primer is substantially complementary to a
complement of said template sequence, wherein the parts of said primers which have substantial
complementarity define the termini of the template sequence to be amplified. The first primer and/or
the second primer may include a detectable label (e.g. a fluorescent label).
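
How a primer pair defines the termini of the amplified template can be sketched in silico. The sketch simplifies "substantially complementary" to exact matching, treats the template as the 5'->3' top strand, and uses an invented toy template; the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Complement of a DNA sequence, written 5'->3' like the input."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon(template, forward, reverse):
    """Return the region bounded by the two primers, or None if either fails to bind.

    forward : primer identical to the top strand (binds the complement)
    reverse : primer complementary to the top strand, written 5'->3'
    """
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer's binding site, read on the top strand.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


toy_template = "GGATCCACAGTTTGACCATGAAGCTT"  # invented sequence
print(amplicon(toy_template, "ACAGT", "CATGG"))
```

The returned sequence starts with the forward primer and ends with the reverse complement of the reverse primer, which is the sense in which the primers define its termini.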

The invention also provides a kit comprising first and second single-stranded oligonucleotides which
allow amplification of a streptococcal template nucleic acid sequence contained in a single- or
double-stranded nucleic acid (or mixture thereof), wherein: (a) the first oligonucleotide comprises a
primer sequence which is substantially complementary to said template nucleic acid sequence;
(b) the second oligonucleotide comprises a primer sequence which is substantially complementary to 
the complement of said template nucleic acid sequence; (c) the first oligonucleotide and/or the 
second oligonucleotide comprise(s) sequence which is not complementary to said template nucleic 
acid; and (d) said primer sequences define the termini of the template sequence to be amplified. The 
non-complementary sequence(s) of feature (c) are preferably upstream of (i.e. 5' to) the primer 
sequences. One or both of these (c) sequences may comprise a restriction site [e.g. ref. 29] or a 
promoter sequence [e.g. ref. 30]. The first oligonucleotide and/or the second oligonucleotide may include
a detectable label (e.g. a fluorescent label). 

The invention provides a process for detecting nucleic acid of the invention, comprising the steps of: 
(a) contacting a nucleic probe according to the invention with a biological sample under hybridising 
conditions to form duplexes; and (b) detecting said duplexes. 

The invention provides a process for detecting GBS in a biological sample (e.g. blood), comprising 
the step of contacting nucleic acid according to the invention with the biological sample under 
hybridising conditions. The process may involve nucleic acid amplification (e.g. PCR, SDA, SSSR,
LCR, TMA, NASBA, etc.) or hybridisation (e.g. microarrays, blots, hybridisation with a probe in 
solution etc.). PCR detection of GBS in clinical samples has been reported [e.g. see refs. 31 to 34]. 
Clinical assays based on nucleic acid are described in general in ref. 35. 

The invention provides a process for preparing a fragment of a target sequence, wherein the fragment 
is prepared by extension of a nucleic acid primer. The target sequence and/or the primer are nucleic 
acids of the invention. The primer extension reaction may involve nucleic acid amplification (e.g. 
PCR, SDA, SSSR, LCR, TMA, NASBA, etc.). 

Nucleic acid amplification according to the invention may be quantitative and/or real-time. 

For certain embodiments of the invention, nucleic acids are preferably at least 7 nucleotides in length 
(e.g. 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 
34, 35, 36, 37, 38, 39, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 
180, 190, 200, 225, 250, 275, 300 nucleotides or longer). 

For certain embodiments of the invention, nucleic acids are preferably at most 500 nucleotides in 
length (e.g. 450, 400, 350, 300, 250, 200, 150, 140, 130, 120, 110, 100, 90, 80, 75, 70, 65, 60, 55, 50, 
45, 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15 
nucleotides or shorter). 

Primers and probes of the invention, and other nucleic acids used for hybridization, are preferably 
between 10 and 30 nucleotides in length (e.g. 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, or 30 nucleotides). 

Pharmaceutical compositions 

The invention provides compositions comprising: (a) polypeptide, antibody, and/or nucleic acid of 
the invention; and (b) a pharmaceutically acceptable carrier. These compositions may be suitable as 
immunogenic compositions, for instance, or as diagnostic reagents, or as vaccines. Vaccines 
according to the invention may either be prophylactic (i.e. to prevent infection) or therapeutic (i.e. to 
treat infection), but will typically be prophylactic. 

A 'pharmaceutically acceptable carrier' includes any carrier that does not itself induce the production 
of antibodies harmful to the individual receiving the composition. Suitable carriers are typically 
large, slowly metabolised macromolecules such as proteins, polysaccharides, polylactic acids, 
polyglycolic acids, polymeric amino acids, amino acid copolymers, sucrose, trehalose, lactose, and 
lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of ordinary 
skill in the art. The vaccines may also contain diluents, such as water, saline, glycerol, etc. 
Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, 
and the like, may be present. Sterile pyrogen-free, phosphate-buffered physiologic saline is a typical 
carrier. A thorough discussion of pharmaceutically acceptable excipients is available in ref. 155. 

Compositions of the invention may include an antimicrobial, particularly if packaged in a multiple 
dose format. 

Compositions of the invention may comprise detergent e.g. a Tween (polysorbate), such as Tween 
80. Detergents are generally present at low levels e.g. <0.01%. 

Compositions of the invention may include sodium salts (e.g. sodium chloride) to give tonicity. A 
concentration of 10±2 mg/ml NaCl is typical. 

Compositions of the invention will generally include a buffer. A phosphate buffer is typical. 

Compositions of the invention may comprise a sugar alcohol (e.g. mannitol) or a disaccharide (e.g. 
sucrose or trehalose) e.g. at around 15-30mg/ml (e.g. 25 mg/ml), particularly if they are to be 
lyophilised or if they include material which has been reconstituted from lyophilised material. The 
pH of a composition for lyophilisation may be adjusted to around 6.1 prior to lyophilisation. 

Polypeptides of the invention may be administered in conjunction with other immunoregulatory 
agents. In particular, compositions will usually include a vaccine adjuvant. Adjuvants which may be 
used in compositions of the invention include, but are not limited to: 

A. Mineral-containing compositions 

Mineral containing compositions suitable for use as adjuvants in the invention include mineral salts, 
such as aluminium salts and calcium salts. The invention includes mineral salts such as hydroxides 
(e.g. oxyhydroxides), phosphates (e.g. hydroxyphosphates, orthophosphates), sulphates, etc. [e.g. see 
chapters 8 & 9 of ref. 36], or mixtures of different mineral compounds (e.g. a mixture of a phosphate 
and a hydroxide adjuvant, optionally with an excess of the phosphate), with the compounds taking 
any suitable form (e.g. gel, crystalline, amorphous, etc.), and with adsorption to the salt(s) being 
preferred. Mineral containing compositions may also be formulated as a particle of metal salt [37]. 

Aluminium salts may be included in vaccines of the invention such that the dose of Al³⁺ is between 
0.2 and 1.0 mg per dose. 

A typical aluminium phosphate adjuvant is amorphous aluminium hydroxyphosphate with PO4/Al 
molar ratio between 0.84 and 0.92, included at 0.6 mg Al³⁺/ml. Adsorption with a low dose of 
aluminium phosphate may be used e.g. between 50 and 100 µg Al³⁺ per conjugate per dose. Where an 
aluminium phosphate is used and it is desired not to adsorb an antigen to the adjuvant, this is 
favoured by including free phosphate ions in solution (e.g. by the use of a phosphate buffer). 
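
As a worked example of how the concentration and per-dose figures relate, the sketch below multiplies the 0.6 mg Al³⁺/ml concentration by a dose volume and checks the result against the 0.2 to 1.0 mg per dose range. The pairing with a 0.5 ml dose volume is an assumption made for illustration (0.5 ml is the typical injection volume mentioned elsewhere in this description), and the function name is hypothetical.

```python
# Worked arithmetic only; pairing 0.6 mg/ml with a 0.5 ml dose is an assumption.
AL_DOSE_RANGE_MG = (0.2, 1.0)  # stated range of Al3+ per dose


def aluminium_per_dose_mg(concentration_mg_per_ml, dose_volume_ml):
    """Mass of Al3+ delivered in one dose, in mg."""
    return concentration_mg_per_ml * dose_volume_ml


al_dose = aluminium_per_dose_mg(0.6, 0.5)
in_range = AL_DOSE_RANGE_MG[0] <= al_dose <= AL_DOSE_RANGE_MG[1]
print(f"{al_dose:.2f} mg Al3+ per dose; within stated range: {in_range}")
```

Under these assumptions the dose works out to 0.3 mg Al³⁺, which lies inside the stated range.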

B. Oil Emulsions 

Oil emulsion compositions suitable for use as adjuvants in the invention include squalene-water 
emulsions, such as MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, formulated into 
submicron particles using a microfluidizer) [Chapter 10 of ref. 36; see also refs. 38-40]. MF59 is 
used as the adjuvant in the FLUAD™ influenza virus trivalent subunit vaccine. 

Particularly preferred adjuvants for use in the compositions are submicron oil-in-water emulsions. 
Preferred submicron oil-in-water emulsions for use herein are squalene/water emulsions optionally 
containing varying amounts of MTP-PE, such as a submicron oil-in-water emulsion containing 4-5% 
w/v squalene, 0.25-1.0% w/v Tween 80 (polyoxyethylenesorbitan monooleate), and/or 0.25-1.0% 
Span 85 (sorbitan trioleate), and, optionally, N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine- 
2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE). Submicron 
oil-in-water emulsions, methods of making the same and immunostimulating agents, such as 
muramyl peptides, for use in the compositions, are described in detail in references 38 & 41-42. 

Complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA) may also be used as 
adjuvants in the invention. 

C. Saponin formulations [chapter 22 of ref. 36] 

Saponin formulations may also be used as adjuvants in the invention. Saponins are a heterogeneous 
group of sterol glycosides and triterpenoid glycosides that are found in the bark, leaves, stems, roots 
and even flowers of a wide range of plant species. Saponins isolated from the bark of the Quillaja 
saponaria Molina tree have been widely studied as adjuvants. Saponin can also be commercially 
obtained from Smilax ornata (sarsaparilla), Gypsophila paniculata (brides veil), and Saponaria 
officinalis (soap root). Saponin adjuvant formulations include purified formulations, such as QS21, 
as well as lipid formulations, such as ISCOMs. 

Saponin compositions have been purified using HPLC and RP-HPLC. Specific purified fractions 
using these techniques have been identified, including QS7, QS17, QS18, QS21, QH-A, QH-B and 
QH-C. Preferably, the saponin is QS21. A method of production of QS21 is disclosed in ref. 43. 
Saponin formulations may also comprise a sterol, such as cholesterol [44]. 

Combinations of saponins and cholesterols can be used to form unique particles called 
immunostimulating complexes (ISCOMs) [chapter 23 of ref. 36]. ISCOMs typically also include a 
phospholipid such as phosphatidylethanolamine or phosphatidylcholine. Any known saponin can be 
used in ISCOMs. Preferably, the ISCOM includes one or more of QuilA, QHA and QHC. ISCOMs 
are further described in refs. 44-46. Optionally, the ISCOMs may be devoid of additional 
detergent(s) [47]. 

A review of the development of saponin based adjuvants can be found in refs. 48 & 49. 

D. Virosomes and virus-like particles 

Virosomes and virus-like particles (VLPs) can also be used as adjuvants in the invention. These 
structures generally contain one or more proteins from a virus optionally combined or formulated 
with a phospholipid. They are generally non-pathogenic, non-replicating and generally do not contain 
any of the native viral genome. The viral proteins may be recombinantly produced or isolated from 
whole viruses. These viral proteins suitable for use in virosomes or VLPs include proteins derived 
from influenza virus (such as HA or NA), Hepatitis B virus (such as core or capsid proteins), 
Hepatitis E virus, measles virus, Sindbis virus, Rotavirus, Foot-and-Mouth Disease virus, Retrovirus, 
Norwalk virus, human Papilloma virus, HIV, RNA-phages, Qβ-phage (such as coat proteins), GA- 
phage, fr-phage, AP205 phage, and Ty (such as retrotransposon Ty protein p1). VLPs are discussed 
further in refs. 50-55. Virosomes are discussed further in, for example, ref. 56. 

E. Bacterial or microbial derivatives 

Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as 
non-toxic derivatives of enterobacterial lipopolysaccharide (LPS), Lipid A derivatives, 
immunostimulatory oligonucleotides and ADP-ribosylating toxins and detoxified derivatives thereof. 

Non-toxic derivatives of LPS include monophosphoryl lipid A (MPL) and 3-O-deacylated MPL 
(3dMPL). 3dMPL is a mixture of 3 de-O-acylated monophosphoryl lipid A with 4, 5 or 6 acylated 
chains. A preferred "small particle" form of 3 De-O-acylated monophosphoryl lipid A is disclosed in 
ref. 57. Such "small particles" of 3dMPL are small enough to be sterile filtered through a 0.22 µm 
membrane [57]. Other non-toxic LPS derivatives include monophosphoryl lipid A mimics, such as 
aminoalkyl glucosaminide phosphate derivatives e.g. RC-529 [58,59]. 

Lipid A derivatives include derivatives of lipid A from Escherichia coli such as OM-174. OM-174 is 
described for example in refs. 60 & 61. 

Immunostimulatory oligonucleotides suitable for use as adjuvants in the invention include nucleotide 
sequences containing a CpG motif (a dinucleotide sequence containing an unmethylated cytosine 
linked by a phosphate bond to a guanosine). Double-stranded RNAs and oligonucleotides containing 
palindromic or poly(dG) sequences have also been shown to be immunostimulatory. 

The CpG's can include nucleotide modifications/analogs such as phosphorothioate modifications and 
can be double-stranded or single-stranded. References 62, 63 and 64 disclose possible analog 
substitutions e.g. replacement of guanosine with 2'-deoxy-7-deazaguanosine. The adjuvant effect of 
CpG oligonucleotides is further discussed in refs. 65-70. 

The CpG sequence may be directed to TLR9, such as the motif GTCGTT or TTCGTT [71]. The 
CpG sequence may be specific for inducing a Th1 immune response, such as a CpG-A ODN, or it 
may be more specific for inducing a B cell response, such as a CpG-B ODN. CpG-A and CpG-B ODNs 
are discussed in refs. 72-74. Preferably, the CpG is a CpG-A ODN. 
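
A simple way to see what "containing a CpG motif" means at the sequence level is to scan an oligonucleotide for the hexamers GTCGTT and TTCGTT named above and to count CG dinucleotides. The sketch below does this; the function names and the example sequence are illustrative assumptions, and a sequence scan cannot tell whether the cytosine is methylated, so it says nothing about actual immunostimulatory activity.

```python
import re

# Hexamer motifs named in the text as directed to TLR9.
TLR9_MOTIFS = ("GTCGTT", "TTCGTT")


def find_cpg_motifs(oligo):
    """Return sorted (position, motif) pairs for each motif occurrence.

    Positions are 0-based; a lookahead is used so that overlapping
    occurrences would also be reported.
    """
    seq = oligo.upper()
    hits = []
    for motif in TLR9_MOTIFS:
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    return sorted(hits)


def count_cpg_dinucleotides(oligo):
    """Count CG dinucleotides (methylation state is not visible here)."""
    return oligo.upper().count("CG")


example = "TCGTCGTTTTGTCGTTTTGTCGTT"  # hypothetical test oligonucleotide
print(find_cpg_motifs(example))
print(count_cpg_dinucleotides(example))
```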

Preferably, the CpG oligonucleotide is constructed so that the 5' end is accessible for receptor 
recognition. Optionally, two CpG oligonucleotide sequences may be attached at their 3' ends to form 
"immunomers". See, for example, refs. 71 & 75-77. 

Bacterial ADP-ribosylating toxins and detoxified derivatives thereof may be used as adjuvants in the 
invention. Preferably, the protein is derived from E.coli (E.coli heat labile enterotoxin "LT"), cholera 
("CT"), or pertussis ("PT"). The use of detoxified ADP-ribosylating toxins as mucosal adjuvants is 
described in ref. 78 and as parenteral adjuvants in ref. 79. The toxin or toxoid is preferably in the 
form of a holotoxin, comprising both A and B subunits. Preferably, the A subunit contains a 
detoxifying mutation; preferably the B subunit is not mutated. Preferably, the adjuvant is a detoxified 
LT mutant such as LT-K63, LT-R72, and LT-G192. The use of ADP-ribosylating toxins and 
detoxified derivatives thereof, particularly LT-K63 and LT-R72, as adjuvants can be found in refs. 
80-87. Numerical reference for amino acid substitutions is preferably based on the alignments of the 
A and B subunits of ADP-ribosylating toxins set forth in ref. 88, specifically incorporated herein by 
reference in its entirety. 

F. Human immunomodulators 

Human immunomodulators suitable for use as adjuvants in the invention include cytokines, such as 
interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12 [89], etc.) [90], interferons (e.g. interferon-γ), 
macrophage colony stimulating factor, and tumor necrosis factor. 

G. Bioadhesives and Mucoadhesives 

Bioadhesives and mucoadhesives may also be used as adjuvants in the invention. Suitable 
bioadhesives include esterified hyaluronic acid microspheres [91] or mucoadhesives such as 
cross-linked derivatives of poly(acrylic acid), polyvinyl alcohol, polyvinyl pyrrolidone, 
polysaccharides and carboxymethylcellulose. Chitosan and derivatives thereof may also be used as 
adjuvants in the invention [92]. 

H. Microparticles 

Microparticles may also be used as adjuvants in the invention. Microparticles (i.e. a particle of 
~100nm to ~150µm in diameter, more preferably ~200nm to ~30µm in diameter, and most preferably 
~500nm to ~10µm in diameter) formed from materials that are biodegradable and non-toxic (e.g. a 
poly(α-hydroxy acid), a polyhydroxybutyric acid, a polyorthoester, a polyanhydride, a 
polycaprolactone, etc.), with poly(lactide-co-glycolide) are preferred, optionally treated to have a 
negatively-charged surface (e.g. with SDS) or a positively-charged surface (e.g. with a cationic 
detergent, such as CTAB). 

I. Liposomes (Chapters 13 & 14 of ref. 36) 

Examples of liposome formulations suitable for use as adjuvants are described in refs. 93-95. 

J. Polyoxyethylene ether and polyoxyethylene ester formulations 

Adjuvants suitable for use in the invention include polyoxyethylene ethers and polyoxyethylene 
esters [96]. Such formulations further include polyoxyethylene sorbitan ester surfactants in 
combination with an octoxynol [97] as well as polyoxyethylene alkyl ethers or ester surfactants in 
combination with at least one additional non-ionic surfactant such as an octoxynol [98]. Preferred 
polyoxyethylene ethers are selected from the following group: polyoxyethylene-9-lauryl ether 
(laureth 9), polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl ether, polyoxyethylene-4- 
lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether. 

K. Polyphosphazene (PCPP) 

PCPP formulations are described, for example, in refs. 99 and 100. 

L. Muramyl peptides 

Examples of muramyl peptides suitable for use as adjuvants in the invention include N-acetyl- 
muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor- 
MDP), and N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3- 
hydroxyphosphoryloxy)-ethylamine (MTP-PE). 

M. Imidazoquinolone Compounds. 

Examples of imidazoquinolone compounds suitable for use as adjuvants in the invention include 
Imiquimod and its homologues (e.g. "Resiquimod 3M"), described further in refs. 101 and 102. 

N. Thiosemicarbazone Compounds. 

Examples of thiosemicarbazone compounds, as well as methods of formulating, manufacturing, and 
screening for compounds all suitable for use as adjuvants in the invention include those described in 
ref. 103. The thiosemicarbazones are particularly effective in the stimulation of human peripheral 
blood mononuclear cells for the production of cytokines, such as TNF-α. 

O. Tryptanthrin Compounds. 

Examples of tryptanthrin compounds, as well as methods of formulating, manufacturing, and 
screening for compounds all suitable for use as adjuvants in the invention include those described in 
ref. 104. The tryptanthrin compounds are particularly effective in the stimulation of human 
peripheral blood mononuclear cells for the production of cytokines, such as TNF-α. 

The invention may also comprise combinations of aspects of one or more of the adjuvants identified 
above. For example, the following combinations may be used as adjuvant compositions in the 
invention: (1) a saponin and an oil-in-water emulsion [105]; (2) a saponin (e.g. QS21) + a non-toxic 
LPS derivative (e.g. 3dMPL) [106]; (3) a saponin (e.g. QS21) + a non-toxic LPS derivative (e.g. 
3dMPL) + a cholesterol; (4) a saponin (e.g. QS21) + 3dMPL + IL-12 (optionally + a sterol) [107]; 
(5) combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions [108]; (6) SAF, 
containing 10% squalane, 0.4% Tween 80™, 5% pluronic-block polymer L121, and thr-MDP, either 
microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion; (7) 
Ribi™ adjuvant system (RAS), (Ribi Immunochem) containing 2% squalene, 0.2% Tween 80, and 
one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A 
(MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS 
(Detox™); (8) one or more mineral salts (such as an aluminum salt) + a non-toxic derivative of LPS 
(such as 3dMPL); and (9) one or more mineral salts (such as an aluminum salt) + an 
immunostimulatory oligonucleotide (such as a nucleotide sequence including a CpG motif). 

Other substances that act as immunostimulating agents are disclosed in chapter 7 of ref. 36. 

The use of an aluminium hydroxide or aluminium phosphate adjuvant is particularly preferred, and 
antigens are generally adsorbed to these salts. Calcium phosphate is another preferred adjuvant. 

The pH of compositions of the invention is preferably between 6 and 8, preferably about 7. Stable pH 
may be maintained by the use of a buffer. Where a composition comprises an aluminium hydroxide 
salt, it is preferred to use a histidine buffer [109]. The composition may be sterile and/or 
pyrogen-free. Compositions of the invention may be isotonic with respect to humans. 
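
The numerical preferences scattered through this section (pH between 6 and 8, NaCl at 10±2 mg/ml, detergent below 0.01%, and a sugar alcohol or disaccharide at around 15-30 mg/ml where lyophilisation is involved) can be collected into a simple range check. The sketch below is one hypothetical way to do so; the class, field and function names are assumptions for illustration, and the ranges are typical or preferred values from the text rather than requirements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Formulation:
    """Hypothetical record of a candidate formulation."""
    ph: float
    nacl_mg_per_ml: float
    detergent_percent: float
    sugar_mg_per_ml: Optional[float] = None  # only relevant if lyophilised


def out_of_range(f: Formulation) -> List[str]:
    """List the parameters that fall outside the typical ranges in the text."""
    issues = []
    if not 6.0 <= f.ph <= 8.0:
        issues.append("pH outside 6-8")
    if not 8.0 <= f.nacl_mg_per_ml <= 12.0:
        issues.append("NaCl outside 10±2 mg/ml")
    if not f.detergent_percent < 0.01:
        issues.append("detergent not below 0.01%")
    if f.sugar_mg_per_ml is not None and not 15.0 <= f.sugar_mg_per_ml <= 30.0:
        issues.append("sugar outside 15-30 mg/ml")
    return issues


print(out_of_range(Formulation(ph=7.0, nacl_mg_per_ml=10.0,
                               detergent_percent=0.005, sugar_mg_per_ml=25.0)))
```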

Compositions may be presented in vials, or they may be presented in ready-filled syringes. The 
syringes may be supplied with or without needles. A syringe will include a single dose of the 
composition, whereas a vial may include a single dose or multiple doses. Injectable compositions 
will usually be liquid solutions or suspensions. Alternatively, they may be presented in solid form 
(e.g. freeze-dried) for solution or suspension in liquid vehicles prior to injection. 

Compositions of the invention may be packaged in unit dose form or in multiple dose form. For 
multiple dose forms, vials are preferred to pre-filled syringes. Effective dosage volumes can be 
routinely established, but a typical human dose of the composition for injection has a volume of 
0.5ml. 

Where a composition of the invention is to be prepared extemporaneously prior to use (e.g. where a 
component is presented in lyophilised form) and is presented as a kit, the kit may comprise two vials, 
or it may comprise one ready-filled syringe and one vial, with the contents of the syringe being used 
to reactivate the contents of the vial prior to injection. 

Immunogenic compositions used as vaccines comprise an immunologically effective amount of 
antigen(s), as well as any other components, as needed. By 'immunologically effective amount', it is 
meant that the administration of that amount to an individual, either in a single dose or as part of a 
series, is effective for treatment or prevention. This amount varies depending upon the health and 
physical condition of the individual to be treated, age, the taxonomic group of individual to be treated 
(e.g. non-human primate, primate, etc.), the capacity of the individual's immune system to synthesise 
antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's 
assessment of the medical situation, and other relevant factors. It is expected that the amount will fall 
in a relatively broad range that can be determined through routine trials, and a typical quantity of 
each meningococcal saccharide antigen per dose is between 1 µg and 10 mg per antigen. 

Pharmaceutical uses 

The invention also provides a method of treating a patient, comprising administering to the patient a 
therapeutically effective amount of a composition of the invention. The patient may either be at risk 
from the disease themselves or may be a pregnant woman ('maternal immunisation' [110]). 

The invention provides nucleic acid, polypeptide, or antibody of the invention for use as 
medicaments (e.g. as immunogenic compositions or as vaccines) or as diagnostic reagents. It also 
provides the use of nucleic acid, polypeptide, or antibody of the invention in the manufacture of: (i) a 
medicament for treating or preventing disease and/or infection caused by GBS; (ii) a diagnostic 
reagent for detecting the presence of GBS or of antibodies raised against GBS; and/or (iii) a reagent 
which can raise antibodies against GBS. Said GBS can be of any serotype or strain. Said disease may 
be, for instance, bacteremia, meningitis, puerperal fever, scarlet fever, erysipelas, pharyngitis, 
impetigo, necrotising fasciitis, myositis or toxic shock syndrome. 

The patient is preferably a human. Where the vaccine is for prophylactic use, the human is preferably 
an adolescent (e.g. aged between 10 and 20 years); where the vaccine is for therapeutic use, the 
human is preferably an adult. A vaccine intended for children or adolescents may also be 
administered to adults e.g. to assess safety, dosage, immunogenicity, etc. 

One way of checking efficacy of therapeutic treatment involves monitoring GBS infection after 
administration of the composition of the invention. One way of checking efficacy of prophylactic 
treatment involves monitoring immune responses against an administered polypeptide after 
administration. Immunogenicity of compositions of the invention can be determined by 
administering them to test subjects (e.g. children 12-16 months of age, or animal models e.g. a mouse 
model) and then determining standard parameters including ELISA titres (GMT) of IgG. These 
immune responses will generally be determined around 4 weeks after administration of the 
composition, and compared to values determined before administration of the composition. Where 
more than one dose of the composition is administered, more than one post-administration 
determination may be made. A mouse neonatal sepsis model for protective efficacy against GBS 
infection is known e.g. see ref. 111. 
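
The comparison of ELISA titres before and after administration is commonly summarised as a geometric mean titre (GMT) and a fold rise. The sketch below shows that calculation; the function names and the titre values are invented for illustration and are not data from this application.

```python
import math


def geometric_mean_titre(titres):
    """Geometric mean of a list of positive titres."""
    if not titres or any(t <= 0 for t in titres):
        raise ValueError("titres must be a non-empty list of positive values")
    return math.exp(sum(math.log(t) for t in titres) / len(titres))


def fold_rise(pre_titres, post_titres):
    """Ratio of post-administration GMT to pre-administration GMT."""
    return geometric_mean_titre(post_titres) / geometric_mean_titre(pre_titres)


# Hypothetical IgG ELISA titres for two subjects, before and about 4 weeks after
pre = [50, 50]
post = [100, 400]
print(geometric_mean_titre(post), fold_rise(pre, post))
```

The geometric rather than arithmetic mean is used because titres are typically obtained from serial dilutions and so vary on a multiplicative scale.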

Administration of polypeptide antigens is a preferred method of treatment for inducing immunity. 
Administration of antibodies of the invention is another preferred method of treatment. This method 
of passive immunisation is particularly useful for newborn children or for pregnant women. This 
method will typically use monoclonal antibodies, which will be humanised or fully human. 

Preferred compositions for use in immunisation include more than one GBS polypeptide. Multiple 
antigens can be included as separate admixed polypeptides in a single composition, and/or can be 
part of a hybrid polypeptide as described above. Preferred combinations of antigens include at least 
one (e.g. 1, 2, 3, 4, 5, 6 or more) 'core' polypeptide (as described below; Table V) and at least one 
(e.g. 1, 2, 3, 4, 5, 6 or more) 'variable' polypeptide (as described below; Table VI). Mixtures of one 
core polypeptide with more than one variable polypeptide are preferred. Examples of these 
combinations, using the nomenclature of reference 2, include (a) GBS322 (a core antigen) plus 
GBS80, GBS104 & GBS67 (all variable antigens); and (b) GBS322 plus GBS80 & GBS104. In 
some embodiments, this specific 3-valent combination [112] and this specific 4-valent combination 
[113] are excluded from the invention, although they illustrate the principle of combining core and 
variable antigens. 
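
The principle of pairing one core antigen with more than one variable antigen can be enumerated mechanically. The sketch below uses the antigen names given above purely as labels; the function name is hypothetical, and the code makes no statement about which combinations fall inside or outside the invention.

```python
from itertools import combinations


def core_plus_variable(core, variable, min_variable=2):
    """Yield tuples of one core antigen plus at least min_variable variable antigens."""
    for core_antigen in core:
        for k in range(min_variable, len(variable) + 1):
            for chosen in combinations(variable, k):
                yield (core_antigen,) + chosen


core_antigens = ["GBS322"]
variable_antigens = ["GBS80", "GBS104", "GBS67"]
mixtures = list(core_plus_variable(core_antigens, variable_antigens))
for mixture in mixtures:
    print(" + ".join(mixture))
```

With one core and three variable antigens this yields three 3-valent mixtures and one 4-valent mixture.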

Compositions of the invention will generally be administered directly to a patient. Direct delivery 
may be accomplished by parenteral injection (e.g. subcutaneously, intraperitoneally, intravenously, 
intramuscularly, or to the interstitial space of a tissue), or by rectal, oral, vaginal, topical, 
transdermal, intranasal, sublingual, ocular, aural, pulmonary or other mucosal administration. 
Intramuscular administration to the thigh or the upper arm is preferred. Injection may be via a needle 
(e.g. a hypodermic needle), but needle-free injection may alternatively be used. A typical 
intramuscular dose is 0.5 ml. 

The invention may be used to elicit systemic and/or mucosal immunity. 

Dosage treatment can be a single dose schedule or a multiple dose schedule. Multiple doses may be 
used in a primary immunisation schedule and/or in a booster immunisation schedule. A primary dose 
schedule may be followed by a booster dose schedule. Suitable timing between priming doses (e.g. 
between 4-16 weeks), and between priming and boosting, can be routinely determined. 
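
As a small illustration of a multiple dose schedule, the sketch below generates calendar dates for priming doses spaced by a fixed interval and rejects intervals outside the 4-16 week example range. The function name, the fixed-interval assumption and the dates are hypothetical; actual timing would be established as described above.

```python
from datetime import date, timedelta


def priming_schedule(first_dose, interval_weeks, n_doses):
    """Return dates of n_doses priming doses spaced interval_weeks apart."""
    if not 4 <= interval_weeks <= 16:
        raise ValueError("interval outside the 4-16 week example range")
    if n_doses < 1:
        raise ValueError("at least one dose is required")
    return [first_dose + timedelta(weeks=interval_weeks * i) for i in range(n_doses)]


schedule = priming_schedule(date(2024, 1, 1), interval_weeks=4, n_doses=3)
print([d.isoformat() for d in schedule])
```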

Bacterial infections affect various areas of the body and so compositions may be prepared in various 
forms. For example, the compositions may be prepared as injectables, either as liquid solutions or 
suspensions. Solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection 
can also be prepared (e.g. a lyophilised composition). The composition may be prepared for topical 
administration e.g. as an ointment, cream or powder. The composition may be prepared for oral 
administration e.g. as a tablet or capsule, or as a syrup (optionally flavoured). The composition may 
be prepared for pulmonary administration e.g. as an inhaler, using a fine powder or a spray. The 
composition may be prepared as a suppository or pessary. The composition may be prepared for 
nasal, aural or ocular administration e.g. as spray, drops, gel or powder [e.g. refs 114 & 115]. 

Further antigenic components of compositions of the invention 

The invention also provides a composition comprising a polypeptide of the invention and one or 
more of the following further antigens: 

- a saccharide antigen from N. meningitidis serogroup A, C, W135 and/or Y (preferably all 
four), such as the oligosaccharide disclosed in ref. 116 from serogroup C [see also ref. 117] or 
the oligosaccharides of ref. 118. 

- a saccharide antigen from Streptococcus pneumoniae [e.g. 119, 120, 121]. 

- an antigen from hepatitis A virus, such as inactivated virus [e.g. 122, 123]. 

- an antigen from hepatitis B virus, such as the surface and/or core antigens [e.g. 123, 124]. 

- a diphtheria antigen, such as a diphtheria toxoid [e.g. chapter 3 of ref. 125] e.g. the CRM197 
mutant [e.g. 126]. 

- a tetanus antigen, such as a tetanus toxoid [e.g. chapter 4 of ref. 125]. 

- an antigen from Bordetella pertussis, such as pertussis holotoxin (PT) and filamentous 
haemagglutinin (FHA) from B. pertussis, optionally also in combination with pertactin and/or 
agglutinogens 2 and 3 [e.g. refs. 127 & 128]. 

- a saccharide antigen from Haemophilus influenzae B [e.g. 117]. 

- polio antigen(s) [e.g. 129, 130] such as IPV. 

- measles, mumps and/or rubella antigens [e.g. chapters 9, 10 & 11 of ref. 125]. 

- influenza antigen(s) [e.g. chapter 19 of ref. 125], such as the haemagglutinin and/or 
neuraminidase surface proteins. 

- an antigen from Moraxella catarrhalis [e.g. 131]. 

- a saccharide antigen from Streptococcus agalactiae (group B streptococcus). 

- an antigen from Streptococcus pyogenes (group A streptococcus) [e.g. 132, 133, 134]. 

- an antigen from Staphylococcus aureus [e.g. 135]. 

The composition may comprise one or more of these further antigens. 

In another embodiment, the GBS antigens of the invention are combined with one or more additional, 
non-GBS antigens suitable for use in a vaccine designed to protect elderly or immunocompromised 
individuals. For example, the GBS antigens may be combined with an antigen derived from the group 
consisting of Enterococcus faecalis, Staphylococcus aureus, Staphylococcus epidermidis, Pseudomonas 
aeruginosa, Legionella pneumophila, Listeria monocytogenes, Neisseria meningitidis, influenza, and 
Parainfluenza virus ('PIV'). 

Toxic protein antigens may be detoxified where necessary (e.g. detoxification of pertussis toxin by 
chemical and/or genetic means [128]). 

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus 
antigen and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to 
include diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is 
preferred also to include diphtheria and tetanus antigens. DTP combinations are thus preferred. 
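
The stated preference amounts to a simple rule: if any one of the diphtheria, tetanus or pertussis antigens is present, the other two are preferably added. A minimal sketch of that rule follows; the function name and the lower-case antigen labels are assumptions made for illustration.

```python
DTP = {"diphtheria", "tetanus", "pertussis"}


def missing_dtp_components(antigens):
    """Return the D/T/P antigens that would complete a DTP combination.

    An empty list is returned when no D, T or P antigen is present,
    or when all three are already included.
    """
    present = DTP & set(antigens)
    if not present:
        return []
    return sorted(DTP - present)


print(missing_dtp_components(["GBS80", "tetanus"]))
```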

Saccharide antigens are preferably in the form of conjugates. Carrier proteins for the conjugates 
include bacterial toxins (such as diphtheria toxoid or tetanus toxoid), the N. meningitidis outer 
membrane protein [136], synthetic peptides [137,138], heat shock proteins [139,140], pertussis 

WO 2006/069200 



PCT/US2005/046491 



proteins [141,142], protein D from H.influenzae [143,144], cytokines [145], lymphokines [145], H. 
influenzae proteins, hormones [145], growth factors [145], toxin A or B from C.difficile [146], iron- 
uptake proteins [147], artificial proteins comprising multiple human CD4+ T cell epitopes from 
various pathogen-derived antigens [148] such as the N19 protein [149], pneumococcal surface 
protein PspA [150], pneumolysin [151], etc. A preferred carrier protein is the CRM197 protein [152]. 

Antigens in the composition will typically be present at a concentration of at least 1 µg/ml each. In 
general, the concentration of any given antigen will be sufficient to elicit an immune response against 
that antigen. 

As an alternative to using protein antigens in the immunogenic compositions of the invention, 
nucleic acid (preferably DNA e.g. in the form of a plasmid) encoding the antigen may be used. 

Antigens are preferably adsorbed to an aluminium salt. 

Screening methods 

The invention provides a process for determining whether a test compound binds to a polypeptide of 
the invention. If a test compound binds to a polypeptide of the invention and this binding inhibits the 
life cycle of the GBS bacterium, then the test compound can be used as an antibiotic or as a lead 
compound for the design of antibiotics. The process will typically comprise the steps of contacting a 
test compound with a polypeptide of the invention, and determining whether the test compound binds 
to said polypeptide. Preferred polypeptides of the invention for use in these processes are enzymes 
(e.g. tRNA synthetases), membrane transporters and ribosomal polypeptides. Suitable test 
compounds include polypeptides, carbohydrates, lipids, nucleic acids (e.g. DNA, RNA, 
and modified forms thereof), as well as small organic compounds (e.g. MW between 200 and 2000 
Da). The test compounds may be provided individually, but will typically be part of a library (e.g. a 
combinatorial library). Methods for detecting a binding interaction include NMR, filter-binding 
assays, gel-retardation assays, displacement assays, surface plasmon resonance, reverse two-hybrid 
etc. A compound which binds to a polypeptide of the invention can be tested for antibiotic activity by 
contacting the compound with GBS bacteria and then monitoring for inhibition of growth. The 
invention also provides a compound identified using these methods. 

Preferably, the process comprises the steps of: (a) contacting a polypeptide of the invention with one
or more candidate compounds to give a mixture; (b) incubating the mixture to allow the polypeptide and
the candidate compound(s) to interact; and (c) assessing whether the candidate compound binds to
the polypeptide or modulates its activity.

Once a candidate compound has been identified in vitro as a compound that binds to a polypeptide of
the invention then it may be desirable to perform further experiments to confirm the in vivo function
of the compound in inhibiting bacterial growth and/or survival. Thus the method may comprise the further
step of contacting the compound with a GBS bacterium and assessing its effect.

The polypeptide used in the screening process may be free in solution, affixed to a solid support,
located on a cell surface or located intracellularly. Preferably, the binding of a candidate compound
to the polypeptide is detected by means of a label directly or indirectly associated with the candidate
compound. The label may be a fluorophore, radioisotope, or other detectable label.

Preferred polypeptides for use in these screening methods are the 'core' sequences identified below. 
General 

The invention provides a computer-readable medium (e.g. a floppy disk, a hard disk, a CD-ROM, a 
DVD etc.) and/or a computer memory and/or a computer database containing one or more of the 
sequences in the sequence listing. 

The term "comprising" encompasses "including" as well as "consisting" e.g. a composition 
"comprising" X may consist exclusively of X or may include something additional e.g. X + Y. 

The term "about" in relation to a numerical value x means, for example, x±10%.

The word "substantially" does not exclude "completely" e.g. a composition which is "substantially
free" from Y may be completely free from Y. Where necessary, the word "substantially" may be
omitted from the definition of the invention.

The N-terminus residues in the amino acid sequences in the sequence listing are given as the amino 
acid encoded by the first codon in the corresponding nucleotide sequence. Where the first codon is 
not ATG, it will be understood that it will be translated as methionine when the codon is a start 
codon, but will be translated as the indicated non-Met amino acid when the sequence is at the 
C-terminus of a fusion partner. The invention specifically discloses and encompasses each of the 
amino acid sequences of the sequence listing having a N-terminus methionine residue (e.g. a 
formyl-methionine residue) in place of any indicated non-Met residue. It also specifically discloses 
and encompasses each of the amino acid sequences of the sequence listing starting at any internal 
methionine residues in the sequences. 
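
The two N-terminal variants described in this paragraph can be illustrated with a minimal sketch. The function names and the example sequences used below are hypothetical and are not part of the sequence listing; sequences are assumed to be plain strings of one-letter amino acid codes.

```python
def with_n_terminal_met(protein: str) -> str:
    """Return the sequence with Met in place of any non-Met first residue."""
    if not protein:
        raise ValueError("empty sequence")
    # A sequence that already starts with Met is returned unchanged.
    return protein if protein[0] == "M" else "M" + protein[1:]


def internal_met_starts(protein: str) -> list:
    """Return every sub-sequence that begins at an internal methionine."""
    # Position 0 is skipped because it is the N-terminus, not an internal residue.
    return [protein[i:] for i in range(1, len(protein)) if protein[i] == "M"]
```

For instance, a hypothetical listed sequence beginning with valine would yield one variant with methionine at position 1 plus one shorter variant for each internal methionine.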

As indicated in the above text, nucleic acids and polypeptides of the invention may include 
sequences that: 

(a) are identical (i.e. 100% identical) to the sequences disclosed in the sequence listing; 

(b) share sequence identity with the sequences disclosed in the sequence listing; 

(c) have 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 single nucleotide or amino acid alterations (deletions, 
insertions, substitutions), which may be at separate locations or may be contiguous, as 
compared to the sequences of (a) or (b); and 

(d) when aligned with a particular sequence from the sequence listing using a pairwise alignment
algorithm, a moving window of x monomers (amino acids or nucleotides) moving from start
(N-terminus or 5') to end (C-terminus or 3'), such that for an alignment that extends to p
monomers (where p>x) there are p-x+1 such windows, each window has at least x·y identical
aligned monomers, where: x is selected from 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150,
200; y is selected from 0.50, 0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96,
0.97, 0.98, 0.99; and if x·y is not an integer then it is rounded up to the nearest integer. The
preferred pairwise alignment algorithm is the Needleman-Wunsch global alignment algorithm
[153], using default parameters (e.g. with Gap opening penalty = 10.0, and with Gap extension
penalty = 0.5, using the EBLOSUM62 scoring matrix). This algorithm is conveniently
implemented in the needle tool in the EMBOSS package [154].
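
The moving-window test of item (d) can be sketched as follows. This is an illustration only: it assumes the pairwise alignment has already been computed (for example with the needle tool) and is supplied as two equal-length strings in which "-" marks a gap. The function names are hypothetical, and a column containing a gap is assumed to count as non-identical.

```python
import math
from fractions import Fraction


def required_matches(x, y):
    """Minimum identical monomers per window: x*y, rounded up to an integer."""
    # Fraction avoids floating-point noise (e.g. a product landing just above an integer).
    return math.ceil(Fraction(str(y)) * x)


def windows_satisfied(aligned_a, aligned_b, x, y, gap="-"):
    """Check that every one of the p-x+1 windows has at least x*y identities."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    p = len(aligned_a)
    if p <= x:
        raise ValueError("the alignment must extend beyond the window (p > x)")
    threshold = required_matches(x, y)
    # 1 where the column holds two identical monomers, 0 otherwise.
    same = [int(a == b and a != gap) for a, b in zip(aligned_a, aligned_b)]
    count = sum(same[:x])  # identities in the first window
    if count < threshold:
        return False
    for start in range(1, p - x + 1):
        # Slide by one column: add the entering column, drop the leaving one.
        count += same[start + x - 1] - same[start - 1]
        if count < threshold:
            return False
    return True
```

With x = 25 and y = 0.91, for example, the product 22.75 is rounded up, so each window of 25 aligned positions must contain at least 23 identities.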

The nucleic acids and polypeptides of the invention may additionally have further sequences to the
N-terminus/5' and/or C-terminus/3' of these sequences (a) to (d).

The practice of the present invention will employ, unless otherwise indicated, conventional methods 
of chemistry, biochemistry, molecular biology, immunology and pharmacology, within the skill of 
the art. Such techniques are explained fully in the literature. See, e.g., references 155-162, etc. 

BRIEF DESCRIPTION OF DRAWINGS 

There are no drawings. 

MODES FOR CARRYING OUT THE INVENTION 

Genome sequencing has been carried out on five strains of GBS from different serotypes: '18RS21'
(type II; MLST type ST19), '515' (type Ia; MLST type ST23), 'CJB111' (type V; MLST type ST1),
'COH1' (type III; MLST type ST17) and 'H36B' (type Ib; MLST type ST6). Different numbers of
coding sequences were identified in the five genomes:



Strain         18RS21    515     CJB111    COH1    H36B
Coding seqs    2151      2249    2167      2410    2393



These 11370 coding sequences are given in the sequence listing together with their inferred
translation products. Annotation of these polypeptide sequences is given in Table I. 

The sequence listing gives sequences in pairs, such that an odd-numbered sequence 'n' is a DNA
coding sequence and the even-numbered sequence 'n+1' is the corresponding amino acid sequence:



Strain         18RS21    515          CJB111        COH1           H36B
SEQ ID NOs     1-4302    4303-8800    8801-13134    13135-17954    17955-22740
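
The numbering scheme above lends itself to a simple lookup. The following sketch uses only the SEQ ID ranges and the odd/even pairing rule stated here; the function and variable names are hypothetical.

```python
# SEQ ID ranges per strain, as given in the table above.
SEQ_ID_RANGES = {
    "18RS21": (1, 4302),
    "515": (4303, 8800),
    "CJB111": (8801, 13134),
    "COH1": (13135, 17954),
    "H36B": (17955, 22740),
}


def strain_of(seq_id):
    """Return the strain whose SEQ ID range contains seq_id."""
    for strain, (first, last) in SEQ_ID_RANGES.items():
        if first <= seq_id <= last:
            return strain
    raise ValueError(f"SEQ ID NO {seq_id} is outside the sequence listing")


def is_dna(seq_id):
    """Odd-numbered sequences are DNA; even-numbered ones are amino acid."""
    return seq_id % 2 == 1


def partner(seq_id):
    """Return the other member of the DNA/amino-acid pair."""
    return seq_id + 1 if is_dna(seq_id) else seq_id - 1


def coding_sequence_count(strain):
    """Each coding sequence occupies two consecutive SEQ ID numbers."""
    first, last = SEQ_ID_RANGES[strain]
    return (last - first + 1) // 2
```

Halving the size of each range reproduces the per-strain coding sequence counts given earlier, and the five counts sum to 11370.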



The polypeptides and their epitopes can be used as antigens e.g. in vaccines or diagnostic tests.

Homologous coding sequences between strains are shown in Table II (listing SEQ ID numbers). For 
comparison, Table II also includes the 'gi' (GenInfo Identifier) accession numbers for strains
2603V/R (serotype V; MLST type ST106) [1] and NEM316 (serotype III; MLST type ST23) [3]. A 
single row in Table II includes all homologs and, where applicable, paralogs within a single strain. 

In contrast to Table II, coding sequences without homologs in any of the other six sequenced
genomes (i.e. unique to one strain within the seven strains) are listed in Table III. These are preferred
coding sequences of the invention e.g. when strain-specificity is desired. Each of the seven
sequenced genomes contains between 13 and 61 sequences not present in any of the other strains.
This variability exceeds that seen in the comparative genome hybridization analysis of reference 1.

Table IV lists coding sequences in the five newly sequenced genomes that do not have any homologs
in strains 2603V/R [1] or NEM316 [3]. These are preferred coding sequences of the invention
e.g. when sequences not known in the prior art are desired.

Table V lists 'core' GBS genes, namely those that are found in all seven sequenced genomes. These
'universal' GBS coding sequences are preferred for use with the invention e.g. when
strain-specificity is not desired, such as when designing a diagnostic test with high inter-strain
cross-reactivity, or when preparing a composition which will elicit antibodies with high inter-strain
cross-reactivity, or when screening for broad-range anti-GBS antibiotics. Table VI lists variable GBS
genes, namely those that are found in at least two sequenced genomes, but not in all seven. The
format of Tables V and VI follows that of Table II.

The GBS "pan-genome" can thus be divided into three parts: a core-genome, strain-specific sequences,
and "dispensable genes" shared only by some of the strains. The core genes describe the basic
aspects of GBS biology and major phenotypic traits, whereas dispensable and strain-specific genes
contribute to the observed genetic diversity of the species and might confer selective advantages,
such as adaptation to different niches, antibiotic resistance, and increased invasive capabilities.
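
The three-way division of the pan-genome can be expressed as a small classification rule. The sketch below is illustrative only: it assumes each group of homologous coding sequences is described by the set of strains in which it is found, and the function name is hypothetical.

```python
# The seven sequenced genomes referred to in the text.
ALL_STRAINS = frozenset(
    {"2603V/R", "18RS21", "515", "CJB111", "COH1", "H36B", "NEM316"}
)


def classify(strains_present):
    """Assign a homolog group to the core, dispensable or strain-specific part."""
    present = frozenset(strains_present)
    if not present or not present <= ALL_STRAINS:
        raise ValueError("expected a non-empty subset of the seven strains")
    if present == ALL_STRAINS:
        return "core"             # found in all seven genomes (Table V)
    if len(present) == 1:
        return "strain-specific"  # unique to one strain (Table III)
    return "dispensable"          # found in two to six genomes (Table VI)
```

For example, a group found only in strains 515 and H36B would be classed as dispensable.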

The vast majority of genes making up the core genome belong to the groups of housekeeping
functions, cell envelope, regulatory functions, and transport and binding proteins. However, about
one third of the shared genes fall into the annotation class of hypothetical proteins and proteins of
unknown function, thus suggesting that many aspects of basic GBS biology still need to be explored.
Because of their 'core' nature, however, these sequences still have utility as they can be used in
situations where inter-strain cross-reactivity is needed, without needing to know their true underlying
biological function. Hypothetical genes and genes of unknown function are much more heavily represented
among the dispensable genes, probably due to the fact that more functions have been ascribed to
better known (i.e. more frequently found) genes. This view is also supported by the strain-specific
genes being predominantly of unknown function. Furthermore, genes associated with mobile and
extrachromosomal elements are particularly abundant in this group, supporting the hypothesis that
the majority of specific traits depend upon phenomena of lateral gene transfer. On the other hand,
this class of genes is very poorly represented within the core genome, indicating that only a few of
these rearrangements have remained stable during evolution of GBS.

The core shared by all isolates (Table V) accounts for only about 80% of any single genome, with the
remaining 20% being absent in at least one other strain (Table VI). Approximately 1800 coding
sequences are shared by the sequenced GBS strains. The criterion for gene identity between genomes
was set low so that coding sequences were considered shared even if they were quite divergent in
sequence. The size of the core is thus likely to be overestimated, but it substantially defines the
basic characteristics of the GBS species. As further GBS genome sequences become available then
this "core" may decrease (by analogy, a coding sequence would move from Table V to Table VI),
but for the purposes of the present invention the "core" is the group given in Table V. Even using the
sequences herein, the core decreases with the addition of each new genome, but extrapolation of the
curve indicates that the core stabilizes at around 1800 coding sequences and will remain constant
even as many more genomes are added.
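
The "about 80%" figure can be checked against the coding sequence counts reported for the five newly sequenced strains. The sketch below assumes a core of exactly 1800 coding sequences, which the text gives only as an approximation; counts for 2603V/R and NEM316 are not stated in this section and are therefore omitted. Variable and function names are hypothetical.

```python
# Coding sequence counts for the five newly sequenced strains.
CODING_SEQS = {
    "18RS21": 2151,
    "515": 2249,
    "CJB111": 2167,
    "COH1": 2410,
    "H36B": 2393,
}
CORE_SIZE = 1800  # approximate core size quoted in the text


def core_fraction(strain):
    """Fraction of a strain's coding sequences accounted for by the core."""
    return CORE_SIZE / CODING_SEQS[strain]


fractions = {strain: core_fraction(strain) for strain in CODING_SEQS}
mean_fraction = sum(fractions.values()) / len(fractions)
```

Under this assumption the core fraction ranges from roughly three quarters of the larger genomes to a little over 83% of the smaller ones, with a mean close to 80%.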

One mechanism by which bacteria can modulate their lifestyle and virulence in response to variable 
stimuli, stress conditions and adaptation to different niches is phase variation [163,164]. Such 
variation occurs by altering the length of short repeated DNA tracts within or immediately upstream 
of coding regions (contingency genes), thus causing frame-shifts and affecting protein synthesis. At 
least one important virulence-associated gene in GBS is regulated in this way [165], and so
identification of further phase variable genes can identify new virulence factors. Virulence factors are 
particularly useful for vaccination, antibiotic targets, etc. Table VII shows such phase variable genes, 
and these are preferred polypeptides for use with the invention. 
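
A minimal sketch of how candidate repeat tracts might be located is given below. It is a simplification and not the method used to compile Table VII: it considers only homopolymeric tracts, and the minimum tract length of 7 is an arbitrary assumption. Function names are hypothetical.

```python
import re


def homopolymer_tracts(dna, min_len=7):
    """Return (start, base, length) for each run of one base of at least min_len."""
    # ([ACGT]) captures a base; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(dna.upper())
    ]


def shifts_frame(old_len, new_len):
    """A length change within a coding region shifts the frame unless it is a multiple of 3."""
    return (new_len - old_len) % 3 != 0
```

A tract that gains or loses a single repeat unit within a coding region changes the reading frame, whereas a change of three bases does not.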

It will be understood that the invention has been described by way of example only and modifications
may be made whilst remaining within the scope and spirit of the invention.

TABLE II — Homologs and paralogs



2603vr 


18rs21 


515 


cjb111 


coh1


h36b 


NEM316 


22535220 


1,3919 


8615 


12909 


14061 


22383 


9441 3708 


22535221 


3 


8617 


12911 


14063 


22385 22427 


9441 370Q 


22535222 


5 


8619, 8703 


12913 


14065 


22387 


9441 371 0 


22535223 


7 


8621 


12915 


14067 


22389 


24413711 


22535224 


9 


8623 


12917 


14069 


22391 


24413712 


22535225 


11 


8625 


12919 


14071 


22393 ' 


24413713 


22533003 


13 




19Q91 


1 AA70 




^oU94427 


22533004 


15 


4303 8fi9Q 


19Q93 




9O0Q7 


^oU944^o 


22533005 


17 


430R 8631 


19Q95 19Q97 




OOQQQ 

^oyy 


oonn A A OO 

^o094429 


22533006 225 
35061 


19 3631 


4307 8305 
8323 8325 
8327 8329 
8787 




1 AC\7Q 1 7971 


OO/lAI 


OOAO/i yl OO 

2o094430 


22533007 


21 


4309 


9607 


14081 


22403 


230Q4431 


22533008 


23 


4311 


9609 


14083 


22405 


930Q4439 


22533009 


25 


4313 


9611 


14085 


22407 


23094433 


22533010 


27, 29 


4315 


9613 


14087 


9940Q 99fifi3 


930QzLd.OA 


22533012 


31 


4317 


9615 


14089 


22413 


23094435 


22533013 


33 


4319 


9617 


140Q1 


99415 


90AQ/t/10£ 

^ouy44oo 


22533014 


35 


4321 


9619 


14093 


22417 


23094437 


22533015 


37 


4323 


Qfi91 


to too, iH-uyo 




OOAQ/I/IOQ 

^oUy44oo 


22533016 225 
34732 


39 


4395 


QROQ 19091 


1*31 07 1AAQ7 
1fifi81 

l DUO I 


91AQQ 00/101 


^oUy44oy 


22533017 


41 


4327 


Q625 


1313Q 13143 
140QQ 


99490 


OQf\QAAAf\ 

£OUy444U 


22533018 


43 


4329 


9627 


13145 13147 

1 O 1 *TO , IO It/, 

13149 14101 

1 w 1 i C j It 1 U 1 


99495 99R93 


900QzLlz!.1 


22533020 


45, 47, 49, 51 


4331 


8801 


13181 


17Qfi3 




22533021 


53 


4333, 4335 


8803 


13183 


17965 


23094443 


22533022 


55 


4337 


8805 


13185 


17QR7 
i / yo/ 


^OUy4444 


22533023 


57 


4339, 4341 


8807 


13187, 13189 


17969 


23094445 


22533024 


59 


4343 


8809 


13191 


17971 


23094446 


22533025 


61 


4345 


8811 


1^1 Crt 


17Q7Q 

i /y/ o 


OOAQ/1 AA~7 

<s:oUy444/ 


22533026 


63 


4347 


8813,8815 ! 


13195 


17975 


23094448 


22533027 


65 


4349 


881 7 


131Q7 


"1 7Q77 1 QAOO 
i /y/ / , loUOO, 

18035 


OOAA/l A ACl 

^oUy444y 


22533028 


67 


4351 


8819 


131QQ 13945 
13247 17811 
17821,17859 


17Q7Q 18007 
i / y /y, i ouo/ , 

99555 


OOAQ/1 ylKA 
/ioUy440U 


22533029 


69, 71,3965, 
4017, 4061 


4353 


8821 


13201 17803 
17891 


17Q81 

1 f v70 1 


930044m 


22533030 


73, 75, 3963 


4355 


8823 


13203 


17Q83 
i / yoo 


90AQA4C;O 


22533032 


77 


4357, 4365 


8825 


13205 


17Q85 


930Q4453 


22533033 


79 




8827 


13207 


17Q87 

1 / v/O / 


930Q4454 


22533034 


81 


4361,4369, 
8707 j 


8829 


13209 


17989, 22505 


23094455 


22533035 


83, 85 


4371 


8831 


13211,13249, 
13251,13253 


17991 


23094456 


22533036 


87 


4373 


8833 


13213 


17993 


23094457 


22533037 


89 


4375 


8835 


13215, 17895 


17995 


23094458 


22533038 


91 


4377, 4413, 
4415, 4417 


8837 


13217, 13255, 
17893 


17997 


23094459


22533039 


93 


4379 


8839 


13219, 13257 


A 7AQA 

i / yyy 




22533040 


95 


4381,4419, 
4421 


8841 


13221, 13259 


A QAf^ 

I oUu 1 




22533041 


97 


4383 


8843 




lOUUO 


2^00.4469 


22533042 


99 


4385 


8845 


a oooc 


lOUUO 




22533043 


101 


4387 


8847 


13227 


18007 


23094464 


22533044 


103 


4389 


8849 


13229 


A QAAQ 




22533045 


105 


4391 


8851 


13231 


a OA-t H 

18011 


/£OUy4400 


22533046 


107 


4393 


8853 


13233 


18013 


23094467 


22533047 


109 


4395 


8855 


13235, 17875, 
17941 


a f\f\A r* A rtAAA 

18015, 18039, 
18041, 22bol 


OOAO/1 ACQ 


22533048 


111 


4397 


8857, 13023 


13237 


18017 


2ouy44oy 


22533049 


113, 115,117, 
3949 


4399 


8859 


13239 


a OAH A 

18019 




22533050 


119 


- 


8861 








22533051 


121 


4405 


8863 


13265 


A QAAQ 




22533053 


123 


4407 


8865 


13267 






22533054 


125 


4409 


8867 


13269 


18027 


23094475 


22533055 


127 


4411 


8869 


13271 


A OAAA 


OQfiQAA7£ 


22533056 


129 


4423 


8871 


13273 


a Qrtn4 

loUol 


9QO.QAA77 


22533057 


131 


4425, 4427 


8873 


13275 


A OA/IO 

1804o 


OQAQAA7P. 
^OUy44/0 


22533058 I 


133 


4429,4431, 
4433 


8875 


13277, 13279 


18045, 1oU4/\ 
loU4y, loUol 


OQAQAA7Q 


22533059 


135 


4435, 4437 


8877 




loUDo 




22533060 


137 


4439 


8879 


17759 


A QAEE 




22533061 


139 


4441 


8881 




1QfiK7 
loUO/ 




22533062 


141 




8883 




18059 


23094483 


22533063 


143 


4443 


8885 


13285 


a QACH 


OQHQAAP.A 


22533064 


145 


4445 


8887 


13287, 13289 


18063, 18071 


OQAQ/IAftR 
^OUy4'fc50 


22533065 


147 


4447 


8889 


13291 


a e\ f\r» f ACS ATA 

18065, 18073 


iioUy44oo 


22533066 


149 


4449 


8891 


13293 


18067, 18069, 

H QA7E 


OQAQA/1P.7 
ZoUy44o / 


22533067 


151 


4451 


8893 


13295 


18077 


23094488 


22533068 


153 


4453 


8895 


13297 


A O A~7A 

18079 


itiouy44oy 


22533069 


155 


4455 


8897 


13299 


a OAOH 

18081 


OQAQAAQH 

^ouy44yu 


22533070 


157 


4457 


8899 


13301 


a QAQO 


^ovjy*f 'i-y i 


22533071 


159 


4461 


8903 


a r\ a r* r* A OOOC 

13155, 13305, 

A 7Anr 

179^0 


a QAQ7 




22533072 


161 


4463 


8905 


13307 


18089 


23094493 


22533073 


163 


4465 


8907 


A ooAn 

13309 


^ QACH 


OQfiQAAQA 


22533074 


165 


4467 


8909 


a nnA a 

13311 


A QAAQ 


O^HQAAQ^ 

couyT-H-ao 


22533075 


167 


4469 


8911 


13313 


A QAAE 

louyo 




22533076,225 
34790 


169,3205 


6299 
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TABLE III — Unique coding sequences 



2603V/R 


22533205, 22533568, 22533571, 22533574, 22533576, 22533597, 22533706,
22533776, 22533967, 22534356, 22534614, 22534650, 22534875, 22534876,
22534877, 22534880, 22534882, 22534885, 22534886,
22534891, 22534892, 22534894, 22534895, 22534898, 22534900, 22534901, 
22534903, 22534906, 22534907, 22534908, 22534911, 22534912, 22534915, 
22534916, 22534917, 22534919, 22534924, 22535035, 22535159, 22535160, 
22535163, 22535164, 22535165, 22535166, 22535167 


18RS21 


1749, 3623, 3981, 3985, 3989, 3991, 3999, 4057, 4059, 4081, 4203, 4205, 4207


515 


4359, 4367, 4803, 4805, 4807, 4809, 4811, 4813, 5289, 5295, 5493, 5497, 5499,
5505, 5507, 5511, 5513, 5515, 5563, 6887, 8267, 8269, 8647, 8683, 8685, 8695, 
8725, 8727, 8783, 8789, 8797 


CJB111




Q91S Q?37 QP'Vl 1 nDfT 1 ; 1flfi81 IflftDQ 19ft9Q 19097 10QAQ 

12951, 12967, 12997, 13043, 13047, 13087, 13091, 13093, 13105 


COH1 


13153, 13157, 13159, 13241, 13243, 13263, 15187, 15201, 15227, 15819, 15821, 
15823, 15825, 15827, 15829, 16015, 16019, 16021, 16023, 16539, 16561, 16565, 
17319, 17321, 17693, 17705, 17707, 17719, 17753, 17785, 17797, 17819, 17897, 
17921, 17933


H36B 


18691, 19065, 19067, 19071, 19073, 19075, 19085, 19087, 19089, 19091, 19093, 
19095, 19099, 19103, 19111, 19113, 19115, 19117, 19119, 19123, 19125, 19127, 
19129, 19131, 19133, 19135, 19139, 19141, 19143, 19145, 19149, 19165, 20099, 
20401, 22529, 22531, 22533, 22535, 22545, 22547, 22557, 22559, 22561, 22565,
22571, 22585, 22589, 22621, 22641, 22667, 22671, 22679, 22695, 22699, 22705,
22715, 22717, 22721, 22723, 22725, 22733


NEM316 


23094662, 23094664, 23094667, 23094668, 23094669, 23094670, 23094794, 
23094796, 23094797, 23094798, 23094799, 23094802, 23094803, 23094806, 
23094808, 23094809, 23094810, 23094811, 23094812, 23094813, 23094814, 
23094815, 23094816, 23094818, 23094820, 23094821, 23094822, 23094823,
23094824, 23094825, 23094827, 23094828, 23094829, 23094830, 23094831, 
23094832, 23094833, 23094835, 23095107, 23095109, 23095110, 23095111, 
23095112, 23095115, 23095116, 23095119, 23095121, 23095122, 23095123,
23095124, 23095125, 23095126, 23095127, 23095128, 23095129, 23095131, 
23095133, 23095134, 23095135, 23095136, 23095137, 23095138, 23095140, 
23095141, 23095142, 23095143, 23095144, 23095145, 23095146, 23095148, 
23095423, 23095425, 23095426, 23095427, 23095428, 23095429, 23095430, 
23095431, 23095433, 23095434, 23095435, 23095436, 23095437, 23095438,
23095440, 23095442, 23095443, 23095444, 23095445, 23095446, 23095447, 
23095448, 23095449, 23095450, 23095452, 23095455, 23095456, 23095459, 
23095460, 23095461, 23095462, 23095463, 23095569, 23095570, 23095571,
23095572, 23095573, 23095574, 23095575, 23095576, 23095577, 23095578, 
23095579, 23095580, 23095581, 23095582, 23095583, 23095584, 23095585,
23095586, 23095587, 23095614, 23095615, 23095617, 23095623, 23095624,
23095626, 23095627, 24412909, 24412910, 24412911, 24412912, 24412921, 
24412922, 24412923, 24412924, 24413555 
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18RS21 


283, 1123, 1127, 1129, 1273, 1275, 1731, 2321, 2367, 3839, 3841, 3843, 3845, 
3917, 4047, 4099, 4131, 4139, 4141, 4159, 4181, 4219, 4221, 4223, 4225, 4227, 
4229, 4231, 4233, 4235, 4237, 4239, 4241, 4243, 4245, 4247, 4249, 4251, 4253,
4255, 4263, 4265, 4267, 4269, 4271, 4273, 4275, 4277, 4287


515 


4459, 4479, 4601, 4793, 4797, 4801, 5285, 5291, 5293, 5559, 5561, 5565, 5567, 
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_ 






17619 


22279 




_ 


_ 


_ 




17605 


22269 




_ 




_ 


12551 




22467 






_ 




12149 




21599,22457 






_ 




12147 




22455 : 




m 






12145 




22453 ! 




m 


_ 




12139 




22447 




_ 


_ 




12137 




22445 




- 


_ 




12133 


m 


22441 






- 




12131 




22439 




- 


_ 


m 


12129 




22437 










12127 


_ 


22435 




_ 






12125 


m 


22433 




- 


- 


_ 


• 12119 


_ 


21589 




- 






12117 


_ 


21587 




- 


- 


_ 


10831 




20109 




- 


- 




10827 


m 


20105 




- 




_ 


9897 


m 


20367 






- 




10799 


15375 












12135 




"22443 








m 


12141 




22449 




- 


_ 


m 


12143 


_ 


22451 




- 




„ 


12481 


17143,17947 


21903 




- 






12605 


17389 


22031 




- 


- 


_ 


12817 


17629 


22509 




- 


- 


_ 


12929 




18873 










13045 


15817 














17213 


21951 












17615 


22275 












17617 


22277 












17621 


22281


TABLE VII — Phase variable coding sequences

Variable gene (1)         Location of repeat    Position of repeat (2)
SEQ ID NO: 1821           5' end                145
SEQ ID NO: 1861           5' end                2
SEQ ID NO: 957            5' end                116
SEQ ID NO: 1941           5' end                13
SEQ ID NO: 2141           5' end                11
SEQ ID NO: 2863           5' end                21
SEQ ID NO: 2887           5' end                295
SEQ ID NO: 3191           5' end                18
SEQ ID NO: 2447           5' end                3
SEQ ID NO: 3775           5' end                301
SEQ ID NO: [illegible]    5' end                [illegible]
SEQ ID NO: 3723           Middle                1120
SEQ ID NO: 2313           3' end                3185
SEQ ID NO: 719            Promoter              40
SEQ ID NO: 4631           Promoter              103
SEQ ID NO: 2373           Promoter              1

(1) Given for one strain only; Table II can be used to find any homologs in other strains.

(2) relative to ATG
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CLAIMS 

1. An isolated polypeptide comprising an amino acid sequence which has at least 75% sequence identity
to one or more of the even-numbered amino acid sequences selected from the group consisting of
SEQ ID NOS: 2-22740.

2. The polypeptide of claim 1, comprising one or more of the even-numbered amino acid sequences
selected from the group consisting of SEQ ID NOS: 2-22740.

3. An isolated polypeptide comprising a fragment of at least 7 consecutive amino acids from one or
more of the even-numbered amino acid sequences selected from the group consisting of SEQ ID
NOS: 2-22740.

4. The polypeptide of claim 3, wherein the fragment comprises a T-cell or a B-cell epitope from an
even-numbered amino acid sequence selected from the group consisting of SEQ ID NOS: 2-22740.

5. An antibody which binds to the polypeptide of any preceding claim.

6. The antibody of claim 5 which is monoclonal.

7. An isolated nucleic acid comprising a nucleotide sequence which has at least 75% sequence identity
to one or more of the odd-numbered nucleotide sequences selected from the group consisting of SEQ
ID NOS: 1-22739.

8. The nucleic acid of claim 7, comprising a nucleotide sequence which is an odd-numbered nucleotide
sequence selected from the group consisting of SEQ ID NOS: 1-22739.

9. An isolated nucleic acid which can hybridize to the nucleic acid of claim 8 under high stringency
conditions.

10. An isolated nucleic acid comprising a fragment of 10 or more consecutive nucleotides from one or
more of the odd-numbered nucleotide sequences selected from the group consisting of SEQ ID
NOS: 1-22739.

11. An isolated nucleic acid encoding the polypeptide of any one of claims 1 to 4.

12. A composition comprising: (a) polypeptide, antibody, and/or nucleic acid of any preceding claim; and
(b) a pharmaceutically acceptable carrier.

13. The composition of claim 12, further comprising a vaccine adjuvant.

14. The nucleic acid, polypeptide, or antibody of any one of claims 1 to 11 for use as a medicament.

15. A method of treating a patient, comprising administering to the patient a therapeutically effective
amount of the composition of claim 12.

16. Use of the nucleic acid, polypeptide, or antibody of any one of claims 1 to 11 in the manufacture of a
medicament for treating or preventing disease and/or infection caused by GBS.
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